
This is the replication package for the extended abstract `Linguistic Similarity Within Centralized FLOSS Development`. 

If any information is missing, or if you have any questions, please reach out to the first author at gaughan@u.northwestern.edu.

All relevant code and data for this extended abstract are compressed into `mw_sim_replication.tar.gz`.
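
For convenience, here is a minimal sketch of one way to unpack the archive with standard `tar`. The helper name `extract_package` is hypothetical and is not shipped with the package. The sketch also assumes the archive is a gzip-compressed tarball (as its extension suggests) and that you have write access to the destination directory.

```shell
# Hypothetical helper (not part of the replication package): unpack the
# archive into a destination directory, defaulting to the current directory.
extract_package() {
    archive="${1:-mw_sim_replication.tar.gz}"
    dest="${2:-.}"
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C switch to dest before extracting
    tar -xzf "$archive" -C "$dest"
}
```

Calling `extract_package` with no arguments from the directory that holds the archive unpacks it in place. To inspect the contents first without extracting anything, `tar -tzf mw_sim_replication.tar.gz` lists the archived paths.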

Once decompressed, the replication package contains three subdirectories:
	data/ contains all data files used in our analysis (Phabricator comments and commit logs)
		NOTE: data/phabricator/ contains a `README.txt` that traces the provenance of our final data set through successive merges, cleaning steps, and augmentations
		data/data_collection/ contains all scripts that we used to collect and clean our Phabricator and commit data

	linguistic_analyses/ contains all scripts used in comparing linguistic style between contributor populations 

	figures/ contains the figures and the R scripts that generate them
		Figure 1 is `figures/012026_nowikia_ve_commits_created.png`, which is generated by `figures/VE_commits_plotting.R`
		Figure 2 is `figures/030326_adac_pc3pc4_affil_style.png`, which is generated by `figures/pca_plotting.R`
